Lecture 10 – March 2: More on Language Models
Lecturer: Prof. Lillian Lee
Scribes: Jerzy Hausknecht & Kent Sutherland
Abstract
In the previous lecture, we discussed the idea of relevance models, as presented in [Lavrenko & Croft 01]. For each query, a language model for relevance is constructed; the final product is a language model based on a collection of documents. The details of the final model estimation were very similar to query likelihood, even though the relevance model was derived from the ideas in [Robertson & Spärck Jones 76]. The relevance model work reinforces the “importance of being reversed” (Lafferty and Zhai’s phrasing): assigning likelihood to a query from a document-based model yields better statistical estimates than assigning likelihood to a document from a query-based model. While such an approach to deriving language models “works” mathematically, the intuition behind it severely stretches the original assumptions of the Robertson & Spärck Jones approach. This discussion starts from a mostly clean slate and attempts to justify the language model approach to query likelihood in a more intuitive fashion.
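To make the direction of the query-likelihood computation concrete, here is a minimal sketch in which each document induces a unigram language model and the query is scored under that model. The function names, the toy documents, and the use of linear interpolation with a collection-wide background model are all assumptions made for illustration; the notes summarized here do not specify an estimator.

```python
import math
from collections import Counter

def unigram_model(tokens, background, lam=0.5):
    """Return a function term -> P(term | tokens).

    The maximum-likelihood estimate is linearly interpolated with a
    background model (an assumed smoothing choice, not one from the notes).
    """
    counts = Counter(tokens)
    total = len(tokens)

    def prob(term):
        ml = counts[term] / total if total else 0.0
        return (1 - lam) * ml + lam * background.get(term, 0.0)

    return prob

def query_log_likelihood(query, doc, background, lam=0.5):
    """Log-probability of the query under the document-based model.

    Assumes every query term occurs somewhere in the collection, so the
    smoothed probability is never zero.
    """
    p = unigram_model(doc, background, lam)
    return sum(math.log(p(term)) for term in query)

# Hypothetical toy collection.
docs = {
    "d1": "language models for information retrieval".split(),
    "d2": "singular value decomposition of a term document matrix".split(),
}
collection = [t for tokens in docs.values() for t in tokens]
bg_counts = Counter(collection)
background = {t: c / len(collection) for t, c in bg_counts.items()}

query = "language models".split()
scores = {name: query_log_likelihood(query, tokens, background)
          for name, tokens in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

The document is the longer text, so its model is estimated from many more observations than a model built from a two-word query would be, which is one informal way to read the claim about better statistical estimates.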
Publication date: 2010